16 research outputs found

    Disambiguation of features for improving target class detection from social media text

    Evaluation of Rule-Based Learning and Feature Selection Approaches For Classification

    Feature selection is typically employed before or in conjunction with classification algorithms to reduce feature dimensionality, improve classification performance, and reduce processing time. While dedicated approaches to feature selection exist, such as filter and wrapper approaches, some algorithms perform feature selection implicitly through their learning strategy. In this paper, we investigate the effect of the implicit feature selection of the rule-based PRISM algorithm, compared with the wrapper feature selection approach applied to four popular algorithms: decision trees, naive Bayes, k-nearest neighbors and support vector machines. Moreover, we investigate the performance of the algorithms on target classes, i.e. where the aim is to identify one or more phenomena and distinguish them from their absence (the non-target classes), such as when identifying benign and malignant cancer (two target classes) vs. non-cancer (the non-target class).

    Suicide related text classification with prism algorithm

    Raw but valuable user data is continuously being generated on social media platforms. This data is, however, more valuable when it is mined using approaches such as machine learning techniques. Additionally, this user-generated data can potentially be used to save lives, especially those of vulnerable social media users, as several studies have shown a correlation between social media and suicide. In this study, we aim to contribute to the research on suicide communication on social media. We measured the performance of five machine learning algorithms (Prism, Decision Tree, Naïve Bayes, Random Forest and Support Vector Machine) in classifying suicide-related text from Twitter. The results showed that the Prism algorithm outperformed the other machine learning algorithms, with an F-measure of 0.84 for the target classes (Suicide and Flippant). This result, to the best of our knowledge, is the highest performance that has been achieved in classifying social media suicide-related text.

    Text classification for suicide related tweets

    Online social networks have become a vital medium for communication. On these platforms, users have the freedom to share their opinions as well as receive information from a diverse group of people. Although this can be beneficial, there are growing concerns about the negative impact of these platforms on the safety of their users, such as the spread of suicidal ideation. Therefore, in this study, we aim to determine the performance of machine classifiers in identifying suicide-related text from Twitter (tweets). The experiment was conducted using four popular machine classifiers: Decision Tree, Naive Bayes, Random Forest and Support Vector Machine. The results showed an F-measure ranging from 0.346 to 0.778 for suicide-related communication, with the best performance achieved by the Decision Tree classifier.
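
    The F-measure reported in these abstracts is conventionally the harmonic mean of precision and recall for a given class. The sketch below shows that standard definition for a single target class; the function name `f_measure`, the label names, and the toy predictions are hypothetical and are not taken from the studies listed here, which may have used a different averaging scheme.

```python
def f_measure(y_true, y_pred, target):
    """F1 score for one target class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)  # true positives
    fp = sum(1 for t, p in pairs if t != target and p == target)  # false positives
    fn = sum(1 for t, p in pairs if t == target and p != target)  # false negatives
    if tp == 0:
        # No correct predictions for this class: precision or recall is zero
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and classifier predictions (illustration only)
y_true = ["suicide", "other", "suicide", "other", "suicide", "other"]
y_pred = ["suicide", "suicide", "other", "other", "suicide", "other"]

score = f_measure(y_true, y_pred, "suicide")
print(f"F-measure for the target class: {score:.3f}")
```

    Scoring each target class separately, rather than all classes together, keeps a large non-target class from masking weak performance on the classes of interest.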
